Text File | 1993-08-15 | 16.5 KB | 329 lines

======================================================================

    P O O

    doc: Thu Apr  2 11:59:51 1992
    dlm: Wed Jul 21 14:37:58 1993
    (c) 1992 ant@ips.id.ethz.ch
    uE-Info: 266 0 NIL 0 0 72 3 2 8 ofnI
======================================================================

This file describes some internals of the inetray package. Its name
derives from Principles of Operation.

Overview
--------

The program inetray is responsible for dispatching and scheduling the
rayshade requests. In the usual terminology it acts as the client
requesting services from a number of remotely running servers. It does
that using SUN RPC. Rendering requests do not block; inetray therefore
listens continuously on a socket for incoming results. The data
received from the workers is written to the output file whenever this
is possible.

The program rpc.inetrayd serves two purposes. First, it services a
number of RPC requests dealing with initialization and management.
Second, whenever it receives a rendering request, it spawns off a
worker child and continues to service RPC requests (a restricted
number). The worker renders a part of a frame and then contacts the
dispatcher directly to send it the result. This is done using an
XDR/TCP connection.

inetray.start is a simple RPC daemon servicing requests for starting
the rpc.inetrayd servers.

Rayshade Libraries
------------------

Inetray uses the standard Rayshade libraries (as used in version
4.0.6). Great care has been taken to avoid having to change the
libraries at all. As long as the interface stays the same, no change
is required for Inetray even if rayshade evolves. At times this
decision not to change the rayshade source led to complicated and
maybe clumsy solutions, but in the interest of portability it has
nevertheless been adhered to strictly.

There is one currently unsolved problem arising from this decision: it
seems that the random generator used for textures behaves differently
on big-endian and little-endian machines.
Therefore they don't mix. A solution has been promised for rayshade 5
by Craig Kolb, but so far it isn't clear when it will be out.

Input
-----

rayshade accepts its input in various ways:

    1) rayshade "filename"      (input is file)
    2) ... | rayshade           (input is stdin/pipe)
    3) rayshade < "filename"    (input is stdin/file)
    4) rayshade                 (input is stdin/keyboard)

Inetray from version 1.1.0 on is fully compatible with all of these
possibilities. It achieves this by buffering stdin in a live buffer
(see below). RSInitialize() is then called taking its stdin from this
buffer. Note that in case 1 the buffer is not needed and should indeed
disappear, because it competes with inetray for the keyboard. This
problem is solved in a very simple manner: on buffer startup, SIGINT
is set to kill the buffer; once the buffer encounters an eof (i.e. in
cases 2, 3 & 4, where eof is necessarily encountered before
RSInitialize() returns), SIGINT is ignored. Inetray (the parent) sends
a SIGINT to the buffer on return from RSInitialize(). If the buffer
still has stdin open (necessarily the keyboard) it is killed;
otherwise it continues running.

Case 1 was the only one allowed for Inetray up to version 1.0.1. It
requires the input file to exist on all worker machines. To increase
flexibility, the inetray workers try to use the ``same'' working
directory as the dispatcher (see section Pathnames below).

Note that even when stdin is used for input, the input can contain
references to other files to be read in, namely cpp #includes and
height-fields. Those files must be accessible in much the same way as
the files in case 1 (see above and section Pathnames). If no such
files are included, then no file has to be accessible on the worker
machines.

Case 4 is handled much like cases 2 & 3.
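
The SIGINT handshake described in the Input section can be modelled in
a few lines. The following is only a sketch in Python, not inetray's
code: pipes stand in for stdin and for the channel feeding
RSInitialize(), the names run_buffer and demo are made up, and the
extra "ready" pipe exists only to make the demonstration deterministic.
Unlike the real buffer, the one here simply exits after writing.

```python
import os
import signal


def run_buffer(in_fd, out_fd, ready_fd):
    """Body of the forked buffer process (hypothetical name)."""
    try:
        # On startup, SIGINT is set to kill the buffer.
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        os.write(ready_fd, b"r")      # demo only: handler is installed
        chunks = []
        while True:
            data = os.read(in_fd, 4096)
            if not data:              # eof on the input
                break
            chunks.append(data)
        # eof has been seen: from now on SIGINT is ignored.
        signal.signal(signal.SIGINT, signal.SIG_IGN)
        os.write(out_fd, b"".join(chunks))
        os._exit(0)
    except BaseException:
        os._exit(1)


def demo(input_reaches_eof):
    """Return (buffer_was_killed, data_received_by_parent)."""
    in_r, in_w = os.pipe()            # stands in for stdin
    out_r, out_w = os.pipe()          # buffer -> RSInitialize()
    rdy_r, rdy_w = os.pipe()
    if input_reaches_eof:             # cases 2, 3 & 4
        os.write(in_w, b"scene description")
        os.close(in_w)
        in_w = None
    pid = os.fork()
    if pid == 0:
        os.close(out_r)
        os.close(rdy_r)
        run_buffer(in_r, out_w, rdy_w)    # never returns
    os.close(in_r)
    os.close(out_w)
    os.close(rdy_w)
    os.read(rdy_r, 1)                 # wait until the child is set up
    os.close(rdy_r)
    data = b""
    if input_reaches_eof:
        # Read everything, as RSInitialize() would.
        while True:
            chunk = os.read(out_r, 4096)
            if not chunk:
                break
            data += chunk
    os.close(out_r)
    os.kill(pid, signal.SIGINT)       # "RSInitialize() has returned"
    _, status = os.waitpid(pid, 0)
    if in_w is not None:
        os.close(in_w)
    killed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGINT
    return bool(killed), data
```

Calling demo(True) models cases 2 to 4, where the buffer survives the
SIGINT; demo(False) models case 1, where the buffer is still waiting
on its input and is killed.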

Live Buffers
------------

A live buffer is just a forked process which first reads from one
file descriptor into (malloc'ed) memory and then writes the contents
of the buffer to another file descriptor (in some cases two) before
terminating. End of input is detected when either an eof is reached or
a \0 is read as the last character of a read() syscall. This feature
allows live buffers to be used for reading from TCP connections which
should not be closed.

Live buffers are more expensive than writing a temporary file for
large amounts of data and worsen the already problematic memory
situation, but they avoid having servers write files, which could be a
security problem (see below).

Note that live buffers terminate automatically once their respective
parents have disappeared. This is because they will eventually
encounter an eof on the input file descriptor and start writing. They
always write to a pipe, so when the last reader of the pipe has died
they die on a SIGPIPE.

Authentication & Security
-------------------------

If the servers (rpc.inetrayd) are started as root, they try to change
to the user id supplied to them. This is usually the user id of the
user running the dispatcher (inetray). Any user, however, can set a
different user id for servers started by inetray.start. No server can
run as root (uid == 0). If the uid is illegal on the server, it exits
with an error message in the syslog.

No server ever produces an output file. This limits the security
concerns to changing the access time of files. Of course it is
possible that there are loopholes in this concept; I just haven't
found one yet.

If the server is not started as root, it will continue to run under
the uid it was started as. One has to check that the accessed files
are readable by that user.

The actual usernames under which the servers are running are
displayed by both inetray and inetray.ping.
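
The uid rules from the Authentication & Security section amount to a
small decision procedure. The sketch below restates them in Python
under stated assumptions: the function names are invented, "illegal
uid" is taken to mean "unknown to the local password database", and a
request to run as root is treated as an error because the text does not
say what the server does in that case.

```python
import pwd


def uid_exists(uid):
    """True if the local password database knows this uid."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def choose_server_uid(started_uid, requested_uid, exists=uid_exists):
    """Return the uid the server should run under, or raise.

    started_uid   -- uid the server process was started with
    requested_uid -- uid supplied by the dispatcher
    exists        -- predicate for "uid is legal here" (assumption)
    """
    if started_uid != 0:
        # Not started as root: keep running under the current uid.
        return started_uid
    if requested_uid == 0:
        # No server may run as root; the refusal itself is an assumption.
        raise PermissionError("refusing to run a server as root")
    if not exists(requested_uid):
        # The server described above exits with a syslog entry here.
        raise LookupError("uid %d is not valid on this host" % requested_uid)
    return requested_uid
```

A caller would pass os.getuid() as started_uid and, on success, switch
to the returned uid before doing any work.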

Session Keys
------------

The first request a freshly started server receives carries a session
key. Once a session key is installed, only requests with the same key
are serviced. In practice this means that only the person who issued
an inetray call can kill the running servers and workers.

The key is stored in the file .inetray.key in the directory from
which inetray was invoked. Any existing file is renamed to
.inetray.key.old. inetray displays the current session key on startup.

inetray.ping uses the special key 0. Therefore, if servers hang after
an inetray.ping, they can be killed with inetray.kill 0.

The program inetray.kill needs a session key. If one is given as an
argument, it takes precedence. If no key is supplied, inetray.kill
looks for one in the file .inetray.key.

Version Numbers
---------------

Both inetray and rpc.inetrayd know their version number. The server
passes it back to inetray and inetray.ping upon reception of the first
request. The worker is accepted only if the first character (i.e. the
major version number) of this version number matches.

Pathnames
---------

Since servers can be running on machines with totally different
filesystems but may need to access the input files locally, some
pathname substitution is supported.

All filenames are transferred as-is to the servers/workers. If they
start with a / they are absolute pathnames starting at the respective
root of the machines. This will probably not work well on all but the
most homogeneous networks. If they don't start with a / they are
relative names starting in the current working directory.

The home-directory part is stripped, if possible, from the working
directory in which the client is started. This stripped directory is
then sent to the server, which in turn prepends the home directory of
the uid it is to run as. Note that if nothing was stripped on the
client side, then nothing is added on the server side.
Note also that the right directory is chosen even when the server
cannot run under that user id.

The server tries to chdir to the directory so constructed. If that
fails, it continues to run in the current directory, which is the
directory it was started from. The working directories of the servers
are displayed by both inetray and inetray.ping.

The practical upshot is that if you have the same sub-directory
structure below your home directory on the different machines, you can
start Inetray in any of these directories and the servers/workers will
chdir to the right sub-directories as well.

Port Numbers
------------

The rendered portions of a frame are sent back using an XDR/TCP
connection. The port number for this is defined in config.h
(RESULTPORT) but can be overridden for each user in the .inetrayrc
file.

Registering Servers
-------------------

Whenever inetray or inetray.ping is started, it tries to register
ready servers. First, the servers that are to be started by
inetray.start are started; servers started by inetd are started
automatically when an INIT-request arrives. The machines are contacted
in the following order:

    1: All simple hosts given in the Use List (if any)
    2: All directed broadcast addresses in the Use List (if any)
    3: The Local Network (if option N=0 is not set in the Use List)

After starting, an INIT-request is sent to all machines. Servers that
are to be started by inetd are started automatically when they receive
an INIT-request. The same order applies.

Servers reply by opening a TCP connection on the result port and
sending back status info. Answers may be ignored for two reasons:
either the hostname appears (exactly as given) in the ignore list in
the current .inetrayrc, or the major version number of the server does
not match that of the dispatcher.

If the input comes from stdin, then the contents of the live buffer
(see above) of the dispatcher are sent to live buffers on the server
machines using the same TCP connection.
This is, however, only done once registering is otherwise completed
(i.e. the list of registered machines is complete).

Work Scheduling
---------------

A frame is divided into blocks encompassing > 1 lines. This is done
according to a simple heuristic whose parameters can be controlled by
editing config.h and/or overriding those values in a .inetrayrc file
(see INSTALL/Appendix B for details).

After n workers have been registered, the block size is calculated as
follows:

    blockSize = ySize / blocksPerServer / n

After that, the size is checked against the lower and upper limits
(MINBLOCKSIZE and MAXBLOCKSIZE, respectively). If it exceeds a limit,
it is adjusted accordingly. Then the size of the last, possibly
incomplete, block is calculated and the information printed.

In early versions (up to [0.2.0]), a simple round-robin scheduling was
used: subsequent machines got subsequent blocks to trace; whenever the
end of a frame was reached, the whole process started over with only
the non-terminated blocks. This could lead to quite bad behaviour in
the end. Consider the example file mole.ray. Early blocks (bottom
half) take much longer to trace than later ones. If one machine is
heavily loaded, it won't ever complete its block. This means that one
early block will be outstanding for a very long time, which will
inhibit concurrent writing. Furthermore, with a little bit of bad
luck, this block will be the last one outstanding, which means that a
lot of machines will calculate just this one block in the end. This
block will take a long time to calculate.

Starting with version [0.2.1] a rescheduling step is inserted in the
middle of a frame. The number of machines which have not yet returned
a result is counted, and the first n blocks not yet calculated (n
being the number of those machines) are given priority over other
blocks. These blocks are exactly those residing on the slow machines.
Hopefully, they are redistributed to faster machines this way.
In my setting, this modification led to quite a decrease in the time
needed to complete the last block.

Notes:

  - The scheme presented here also works nicely if workers crash
    during the first half of a frame (which they seem to tend to do).

For version 2.0.0 the scheduling has changed yet again. For images
where all the hard work is done in a small part of the picture, the
old scheduler didn't work very nicely. To solve this problem the
following scheduler has been implemented:

  - During the first round of work scheduling (i.e. until all blocks
    have been dispatched once) the blocks are always scheduled in
    pairs (i.e. one worker renders 2 blocks on every request).

  - When this first pass has been finished, only single blocks are
    dispatched.

It's not clear whether this scheduler is always better than the
earlier versions.

Concurrent Servers & RPC Program Numbers
----------------------------------------

It is possible for one machine to have more than one server (and
worker) running at a time. This feature is implemented to allow
multiprocessor machines to have as many workers running as they have
processors. A machine starting more than one worker cannot start them
using inetd.

Concurrent servers have different RPC program numbers. The first
server gets the program number IRNUM defined in prognum.h. Subsequent
servers get subsequent program numbers. That way, registering with the
portmapper works correctly. It must be noted, though, that all
broadcasts to servers must now be sent for all program numbers.

Error Logging
-------------

The general mechanism is described in README and SUPPORT. Please note
that all errors produced by the rayshade routines are logged as well.
This is done using a funny redirection of stderr to the syslog using
socketpairs and async I/O. For this to work under AUX I had to
implement the socketpair() syscall there, since the built-in one does
not work (at least in our version).
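
The stderr redirection mentioned under Error Logging can be
illustrated as follows. This is a simplified Python sketch, not the
inetray implementation: the function name is invented, and instead of
async I/O it drains the socket pair synchronously after the wrapped
action has finished, so the action's output must fit into the socket
buffer.

```python
import os
import socket


def run_with_fd_logged(fd, log, action):
    """Run action() with fd redirected into a socket pair, then log it.

    fd     -- file descriptor to capture (2 for stderr)
    log    -- callable taking one line of text, e.g. syslog.syslog
    action -- callable whose output on fd should end up in the log
    """
    reader, writer = socket.socketpair()
    saved = os.dup(fd)                  # remember the original target
    try:
        os.dup2(writer.fileno(), fd)    # fd now writes into the pair
        writer.close()                  # fd itself keeps that end open
        action()
    finally:
        os.dup2(saved, fd)              # restore; closes the write end
        os.close(saved)
    received = b""
    while True:
        chunk = reader.recv(4096)
        if not chunk:                   # write end closed: all data read
            break
        received += chunk
    reader.close()
    for line in received.decode(errors="replace").splitlines():
        log(line)
```

With the standard syslog module this could be used as
run_with_fd_logged(2, syslog.syslog, render), where render is a
placeholder for whatever calls into the rayshade routines.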

Error Termination
-----------------

Roughly once every minute, every server checks whether the dispatcher
is still running. If that's not the case, it kills its associated
worker, if it has one, and then exits with an entry in the syslog.

As of version 2.0.0 the server also checks the exit status of its
child once every minute. If the child exited with a status != 0, the
server shuts itself down. This non-zero exit status can have two
different causes: either the rayshade libraries exited explicitly, or
the worker was terminated by a signal (either implicitly (bus error,
segmentation violation, ...) or explicitly (it annoyed either your
sysadm or yourself)).

Socket State
------------

It seems to me there's no clean way to extract the correct state of a
socket without reading kernel memory. Nevertheless, the connection
state must be retrieved in order to check the state of the dispatcher.
In a first test getpeername() was used. Unfortunately it returns the
peer name of the dispatcher even if the dispatcher has been killed
(and the socket is in CLOSE_WAIT/FIN_WAIT_2 state).

Up to version 1.0.1, select()'ing the socket for reading did the
trick, since it was used only as a one-way server->dispatcher
connection. Thus being ready for reading meant an error.

Later versions use the TCP connection to send the stdin (see Live
Buffers above). Therefore checking the state of the connection means
selecting it for read and testing it for being empty at the same time.
There's no UNIX syscall to do this. If it can be guaranteed that
nobody reads the socket between selecting it for read and testing it
for emptiness, then we succeed. Unfortunately there is a Live Buffer
which reads the data written to the socket; this buffer is a separate
process which does not synchronize itself with the server.

The buffer cannot, however, block forever in its reading state. It
will stop reading when the buffer on the dispatcher side is exhausted
or killed. After that it will start writing on the pipe.
Therefore we can disallow checking the dispatcher for life while the
server buffer is in its reading state. It enters the reading state
immediately when lPostBuffer() is called. By selecting the pipe for
reading we can find out when the buffer is in its writing state.

Note that the socket is never written to (by the dispatcher) unless an
INIT request has been successfully completed by the server. Therefore
we don't even have to check the socket for emptiness: selecting it for
read whenever we can guarantee that the buffer is not reading it tells
us whether the dispatcher is still running.

If the buffer on the server side is killed before completing its
reading, then the server also terminates, assuming the death of the
dispatcher. This is ok. If the buffer dies during its writing period,
the pipe is closed and returns eof for the reader, which results in an
error and exit there.

Rpc.inetrayd startup
--------------------

The server can be started up by inetd or inetray.start (or, for
debugging purposes, by hand). It checks its number of arguments to
decide how it was started up. If it is called without any arguments,
it assumes that it was started by inetd. Therefore you have to supply
a dummy argument if you want to start it by hand.
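
The liveness test from the Socket State section can be sketched in
Python as shown below. The function name and the buffer_is_reading
flag are assumptions made for illustration; in the scheme described
above the server derives that state from lPostBuffer() and from
selecting the buffer's pipe.

```python
import select
import socket


def dispatcher_alive(sock, buffer_is_reading):
    """Poll the result connection for the dispatcher's state.

    Returns None while the check is disallowed (the live buffer may
    still be reading the socket), True if the dispatcher still seems
    to be there, and False if the socket has become readable, which
    at this point can only mean eof or an error.
    """
    if buffer_is_reading:
        return None
    readable, _, _ = select.select([sock], [], [], 0)   # poll, no wait
    return not readable
```

A socket pair is enough to try this out locally: closing one end makes
the other end readable, so the function reports the peer as gone.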